Abstract

This paper proposes a statistical mechanics approach to the analysis of income
distribution and inequality. A new distribution function, having its roots in the
framework of $\kappa$-generalized statistics, is derived that is particularly
suitable to describe the whole spectrum of incomes, from the low-middle income
region up to the high-income Pareto power-law regime. Analytical expressions for
the shape, moments and some other basic statistical properties are given.
Furthermore, several well-known econometric tools for measuring inequality, which
all exist in a closed form, are considered. A method for parameter estimation is
also discussed. The model is shown to fit the data on personal income for the
United States remarkably well, and the analysis of inequality performed in terms
of its parameters proves very powerful.

1 Introduction



Measurement of income inequality to evaluate social welfare is of particular 
interest to economics. Since the size distribution of income is the basis of 
inequality measures, correct specification of the income density function is
of great importance. The study of the income size distribution has a long
history. Pareto [1] was apparently responsible for the first attempt at defining
a general "law" that tried to explain the regularities of observed distributions.
Let $P_>(x)$ be the percentage of individuals with incomes greater than or equal
to $x$. Then, the (strong) Pareto law asserts that

$$P_>(x) = \begin{cases} (x/x_0)^{-\alpha} & \text{when } x_0 \le x < \infty \\ 1 & \text{when } x < x_0 \end{cases} \tag{1}$$

for some $x_0, \alpha > 0$, and the support of $P_>(x)$ is $[x_0, +\infty)$.

Available empirical work leaves little doubt that Pareto law, as it stands, 
does not account satisfactorily for a wide range of incomes. Subsequently, 
the use of other density functions to model the income distribution, such 
as the lognormal [2] or gamma [3], has been advocated. However, rapidly 
accruing evidence showed that the lognormal and gamma distributions fit the 
data relatively well in the middle range of income but tend to exaggerate 
the skewness and perform poorly in the upper end [4]. Furthermore, if one's
attention is restricted to the upper tail of the distributions, the evidence does
not contradict the (strong) Pareto law, provided that the chosen $x_0$ is large
enough. This suggests that observed distributions obey a weak version of the
Pareto law [5], i.e.

$$\lim_{x \to +\infty} \frac{P_>(x)}{(x/x_0)^{-\alpha}} = 1 \tag{2}$$

for $P_>(x)$ with support $[a, +\infty)$ and $a > 0$, and some well-known density
functions that have been proposed and implemented in the literature
asymptotically approach (rather than coincide with) the Pareto distribution. Among
these, the Singh-Maddala [6] and Dagum [7] distributions have shown themselves
to be a good compromise between parsimony and goodness-of-fit in many
instances.



Distributions exhibiting Pareto fat tails have also been observed experimentally
in physical statistical systems. Since they differ from the ordinary
exponential distributions, this fact needs a theoretical explanation. In the last
few decades several physical mechanisms have been considered in order to
justify the non-exponential equilibrium distributions. For instance, deviations
from the exponential distribution can originate from quantum effects [8] or
from anomalous diffusion, which introduces nonlinearities in the particle kinetics
both in the Fokker-Planck [9] and in the Boltzmann picture [10] of the system.

In physics, the deviation of the distribution function from the exponential
distribution, i.e. the power-law tails, appears at high energies, so a
relativistic origin of this effect appears the most natural. Recently, a statistical
distribution based on the following one-parameter deformation of the
exponential function

$$\exp_\kappa(x) = \left(\sqrt{1 + \kappa^2 x^2} + \kappa x\right)^{1/\kappa}, \tag{3}$$

with $x \in \mathbb{R}$ and $\kappa \in [0, 1)$, has been proposed by one of the authors [11]. The
$\kappa$-exponential can be inverted easily and the $\kappa$-logarithm is defined by

$$\ln_\kappa(x) = \frac{x^{\kappa} - x^{-\kappa}}{2\kappa}, \tag{4}$$

with $x > 0$ and $\kappa \in [0, 1)$.

The mechanism generating this deformation originates in the microscopic
Einstein relativistic dynamics [12], and the deformation parameter turns out
to be $\kappa \propto 1/c$, $c$ being the speed of light. The value $\kappa \neq 0$ is due to the
finite value of the speed of light, and the deformation originates ultimately from
the Lorentz transformations.

In order to better explain how special relativity conditions the form of the
$\kappa$-exponential function, we recall that the relativistic momenta $x$ and $y$ of two
identical particles A and B which move in the same direction, if observed in the
rest frame of particle B, become $x' = x \overset{\kappa}{\oplus} (-y)$ and $y' = 0$, respectively.

The relativistic composition law $\overset{\kappa}{\oplus}$ for the dimensionless momenta, according
to the Lorentz transformations, is a generalized sum defined through

$$x \overset{\kappa}{\oplus} y = x\sqrt{1 + \kappa^2 y^2} + y\sqrt{1 + \kappa^2 x^2}. \tag{5}$$

The $\kappa$-exponential satisfies the functional equation

$$\exp_\kappa\left(x \overset{\kappa}{\oplus} y\right) = \exp_\kappa(x)\,\exp_\kappa(y), \tag{6}$$

which, in the classical limit $\kappa \to 0$, where $\exp_\kappa(x) \to \exp(x)$ and
$x \overset{\kappa}{\oplus} y \to x + y$, reduces to the classical equation $\exp(x + y) = \exp(x)\exp(y)$ of the
ordinary exponential function.
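
The functional equation (6) can be checked numerically. The following is only a minimal sketch: the function names `exp_kappa` and `kappa_sum` and the numerical values of $\kappa$, $x$ and $y$ are illustrative assumptions, not part of the original text.

```python
import math


def exp_kappa(x, kappa):
    """kappa-exponential of Equation (3); assumes 0 < kappa < 1."""
    return (math.sqrt(1.0 + kappa**2 * x**2) + kappa * x) ** (1.0 / kappa)


def kappa_sum(x, y, kappa):
    """Generalized sum of Equation (5)."""
    return (x * math.sqrt(1.0 + kappa**2 * y**2)
            + y * math.sqrt(1.0 + kappa**2 * x**2))


# Arbitrary illustrative values (hypothetical, chosen only for the check).
kappa = 0.5
x, y = 0.8, 1.7

lhs = exp_kappa(kappa_sum(x, y, kappa), kappa)      # left side of Equation (6)
rhs = exp_kappa(x, kappa) * exp_kappa(y, kappa)     # right side of Equation (6)
print(lhs, rhs)

# For very small kappa the generalized sum approaches the ordinary sum.
print(kappa_sum(x, y, 1e-8), x + y)
```

The two printed values in each pair should agree up to floating-point rounding.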

The relativistic sum defined in Equation (5) induces a relativistic generalized
mathematics where all the mathematical operators and functions emerge
properly deformed. For instance, the ordinary derivative operator transforms
into the $\kappa$-derivative given by

$$\frac{d}{d_\kappa x} = \sqrt{1 + \kappa^2 x^2}\,\frac{d}{dx}. \tag{7}$$

Within this theoretical framework the $\kappa$-exponential emerges as the relativistic
generalization of the ordinary exponential. In particular, the relationship

$$\frac{d}{d_\kappa x}\exp_\kappa(x) = \exp_\kappa(x) \tag{8}$$

holds, which is the relativistic generalization of the classical equation
$(d/dx)\exp(x) = \exp(x)$ involving the ordinary derivative and exponential.



The ordinary exponential function $\exp(x)$ emerges both at low energies, being

$$\exp_\kappa(x) \underset{x \to 0}{\sim} \exp(x), \tag{9}$$

as well as when the deformation parameter $\kappa$ approaches zero, i.e.
$\lim_{\kappa \to 0} \exp_\kappa(x) = \exp(x)$. On the contrary, for high values of $x$ the function
$\exp_\kappa(x)$ presents power-law tails

$$\exp_\kappa(x) \underset{x \to \pm\infty}{\sim} |2\kappa x|^{\pm 1/|\kappa|}. \tag{10}$$

The statistical mechanics based on $\exp_\kappa(x)$ preserves the Legendre structures
of the ordinary statistical mechanics and the underlying entropy is stable
[13]. The relevant statistical distribution at low energies is just the Boltzmann
distribution according to Equation (9), while at high energies it presents
power-law tails according to Equation (10).

The particularly interesting mathematical properties of the $\kappa$-exponential
make this function a very flexible mathematical tool for the efficient
study of non-physical systems as well. Indeed, in the literature this function
has been used extensively in several fields beyond relativity, e.g. in
dynamical systems at the edge of chaos, in fractal systems, in game theory, in
error theory, in economics and so on.

On the other hand, it is well known that Einstein relativity has the same
basis as the Galilean relativity of classical physics, except for the presence of
an extra principle, asserting that information propagates with a
finite speed ($\kappa \neq 0$) and not instantaneously ($\kappa = 0$) as assumed in classical
physics. This natural relativistic principle relegates the ordinary exponential
to the status of an abstract and nonphysical function and legitimates the use
of the $\kappa$-exponential function in the analysis of real systems.

In this paper we exploit the deformed exponential function as a functional 
relationship that is more flexible than the standard one to build statistical 
models by adapting it to the context of income size distribution. Using such a 
deformed exponential function is attractive because it allows one to statisti- 
cally describe the whole spectrum of the size distribution of incomes, ranging 
from the low region to the middle region, and up to the Pareto tail. The
$\kappa$-deformed statistical model leads to a more general formulation that contains
both Pareto and stretched exponential distributions as limiting cases.

The rest of the paper is organized as follows. In Section 2 we examine the
theoretical properties of what we refer to as the $\kappa$-generalized distribution and
show how it is able to account for some basic stylized facts of personal income
data, such as the weak Pareto law and the presence of at least one interior mode.
In Section 3, in order to test the performance of the proposed distribution, we
provide an empirical application to the U.S. personal income data. The paper
is concluded in Section 4.



2 The $\kappa$-generalized statistical distribution

In view of their importance for the proposed statistical model, in the
following we first recall some basic mathematical properties of the $\kappa$-deformed
exponential and logarithm functions. Then we give formulas for the shape,
moments and standard tools for inequality measurement. These include, among
others, the ubiquitous Lorenz curve and the associated Gini measure of income
inequality. In addition, we also discuss a method for parameter estimation.

2.1 The $\kappa$-deformed exponential and logarithm functions

The power-law asymptotic behavior of $\exp_\kappa(x)$ as given by Equation (10)
reappears also in the function $\ln_\kappa(x)$, namely

$$\ln_\kappa(x) \underset{x \to 0^+}{\sim} -\frac{1}{2|\kappa|}\,x^{-|\kappa|} \tag{11a}$$

and

$$\ln_\kappa(x) \underset{x \to +\infty}{\sim} \frac{1}{2|\kappa|}\,x^{|\kappa|}. \tag{11b}$$

Like the ordinary functions, the deformed ones also have the properties

$$\exp_\kappa(x)\,\exp_\kappa(-x) = 1, \tag{12a}$$
$$\ln_\kappa(1/x) = -\ln_\kappa(x) \tag{12b}$$

and

$$\left[\exp_\kappa(x)\right]^r = \exp_{\kappa/r}(rx), \tag{13a}$$
$$\ln_\kappa(x^r) = r\,\ln_{r\kappa}(x). \tag{13b}$$

The Taylor expansions of the functions $\exp_\kappa(x)$ and $\ln_\kappa(x)$ are given by

$$\exp_\kappa(x) = 1 + x + \frac{x^2}{2} + \left(1 - \kappa^2\right)\frac{x^3}{3!} + \ldots \tag{14a}$$

and

$$\ln_\kappa(1 + x) = x - \frac{x^2}{2} + \left(1 + \frac{\kappa^2}{2}\right)\frac{x^3}{3} - \ldots, \tag{14b}$$

respectively, and hold for $x \to 0$.
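
As a quick numerical illustration of Equations (3), (4), (12a)-(13b), the sketch below implements the two deformed functions and compares both sides of each identity. The function names and the test values of $\kappa$, $x$ and $r$ are assumptions made only for this example.

```python
import math


def exp_kappa(x, kappa):
    """kappa-exponential, Equation (3); assumes kappa > 0."""
    return (math.sqrt(1.0 + kappa**2 * x**2) + kappa * x) ** (1.0 / kappa)


def ln_kappa(x, kappa):
    """kappa-logarithm, Equation (4); assumes x > 0 and kappa > 0."""
    return (x**kappa - x ** (-kappa)) / (2.0 * kappa)


# Hypothetical values used only to exercise the identities.
kappa, x, r = 0.4, 1.3, 2.0

checks = {
    "inverse": (ln_kappa(exp_kappa(x, kappa), kappa), x),
    "(12a)": (exp_kappa(x, kappa) * exp_kappa(-x, kappa), 1.0),
    "(12b)": (ln_kappa(1.0 / x, kappa), -ln_kappa(x, kappa)),
    "(13a)": (exp_kappa(x, kappa) ** r, exp_kappa(r * x, kappa / r)),
    "(13b)": (ln_kappa(x**r, kappa), r * ln_kappa(x, r * kappa)),
}
for name, (lhs, rhs) in checks.items():
    print(name, lhs, rhs)

# Classical limit: for very small kappa the ordinary exponential is recovered.
print(exp_kappa(x, 1e-6), math.exp(x))
```

Each printed pair should coincide up to rounding; the last line illustrates the limit $\kappa \to 0$.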

2.2 The distribution and its properties 



In the last few years the $\kappa$-exponential function has been adopted successfully to
analyze non-physical systems as well, including economic systems. In particular,
the $\kappa$-deformation has been employed to propose the so-called
$\kappa$-deformed multinomial logit model to study differentiated product markets [14]
and to model the personal income distribution [15]. In the latter application
the distribution function was defined through

$$P_>(x) = \exp_\kappa\left(-\beta x^\alpha\right), \tag{15}$$

where $x \in \mathbb{R}^+$, $\alpha, \beta > 0$ and $\kappa \in [0, 1)$. The income variable $x$ is defined as
$x = z/\langle z \rangle$, $z$ being the absolute personal income and $\langle z \rangle$ its mean value. The
corresponding density reads

$$p(x) = \frac{\alpha\beta x^{\alpha - 1}\exp_\kappa\left(-\beta x^\alpha\right)}{\sqrt{1 + \beta^2\kappa^2 x^{2\alpha}}}, \tag{16}$$

while the quantile function is available in the following closed form

$$x(u) = \beta^{-1/\alpha}\left[\ln_\kappa\left(\frac{1}{1 - u}\right)\right]^{1/\alpha}, \tag{17}$$

with $u = P_<(x) = 1 - P_>(x)$ and $0 < u < 1$.
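
Equations (15)-(17) translate directly into code. The following minimal sketch (function names and parameter values are illustrative assumptions, not the estimates of Section 3) evaluates the survival function, the density and the quantile function, and checks that the quantile function inverts the distribution function.

```python
import math


def exp_kappa(x, kappa):
    """kappa-exponential, Equation (3); assumes kappa > 0."""
    return (math.sqrt(1.0 + kappa**2 * x**2) + kappa * x) ** (1.0 / kappa)


def ln_kappa(x, kappa):
    """kappa-logarithm, Equation (4); assumes x > 0 and kappa > 0."""
    return (x**kappa - x ** (-kappa)) / (2.0 * kappa)


def survival(x, alpha, beta, kappa):
    """Complementary distribution function P_>(x), Equation (15)."""
    return exp_kappa(-beta * x**alpha, kappa)


def density(x, alpha, beta, kappa):
    """Probability density p(x), Equation (16)."""
    u = beta * x**alpha
    return (alpha * beta * x ** (alpha - 1.0) * exp_kappa(-u, kappa)
            / math.sqrt(1.0 + kappa**2 * u**2))


def quantile(u, alpha, beta, kappa):
    """Quantile function x(u), Equation (17); requires 0 < u < 1."""
    return (ln_kappa(1.0 / (1.0 - u), kappa) / beta) ** (1.0 / alpha)


# Hypothetical parameter values chosen only for illustration.
alpha, beta, kappa = 2.0, 1.0, 0.5
for u in (0.1, 0.5, 0.9, 0.99):
    x = quantile(u, alpha, beta, kappa)
    # The second and third columns should reproduce each other: u = 1 - P_>(x(u)).
    print(x, u, 1.0 - survival(x, alpha, beta, kappa))
```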

As $\kappa \to 0$ this model tends to the stretched exponential distribution; it can
be easily verified that

$$\lim_{\kappa \to 0} P_>(x) = \exp\left(-\beta x^\alpha\right) \tag{18a}$$

and

$$\lim_{\kappa \to 0} p(x) = \alpha\beta x^{\alpha - 1}\exp\left(-\beta x^\alpha\right). \tag{18b}$$

For low incomes ($x \to 0^+$) the distribution behaves similarly to the stretched
exponential of Equations (18a) and (18b), while at high incomes it
approaches a Pareto distribution with scale $(2\beta\kappa)^{-1/\alpha}$ and shape $\alpha/\kappa$, i.e.

$$P_>(x) \underset{x \to +\infty}{\sim} (2\beta\kappa)^{-1/\kappa}\,x^{-\alpha/\kappa} \tag{19a}$$

and

$$p(x) \underset{x \to +\infty}{\sim} \frac{\alpha}{\kappa}\,(2\beta\kappa)^{-1/\kappa}\,x^{-\left(\frac{\alpha}{\kappa} + 1\right)}, \tag{19b}$$

thus satisfying the weak Pareto law [16]

$$\lim_{x \to +\infty} \frac{x\,p(x)}{P_>(x)} = \frac{\alpha}{\kappa}, \tag{20}$$

which is a rephrased version of Equation (2).



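From Equation (17) we easily determine that the median of the distribution
is

$$x_{\mathrm{med}} = \beta^{-1/\alpha}\left[\ln_\kappa(2)\right]^{1/\alpha}. \tag{21}$$

The mode is at

$$x_{\mathrm{mode}} = \beta^{-1/\alpha}\left\{\frac{\alpha^2 + 2\kappa^2(\alpha - 1)}{2\kappa^2\left(\alpha^2 - \kappa^2\right)}\left[\sqrt{1 + \frac{4\kappa^2\left(\alpha^2 - \kappa^2\right)(\alpha - 1)^2}{\left[\alpha^2 + 2\kappa^2(\alpha - 1)\right]^2}} - 1\right]\right\}^{\frac{1}{2\alpha}} \tag{22}$$

if $\alpha > 1$; otherwise, the distribution is zero-modal with a pole at the origin.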
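
The median (21) and the mode (22) can be cross-checked against the distribution itself. The sketch below is illustrative only: the function names are assumptions, and the parameter values are the estimates reported in Section 3, used here simply as a convenient test case. It verifies that the survival function equals 1/2 at the median and that the density is locally maximal at the mode.

```python
import math


def exp_kappa(x, kappa):
    """kappa-exponential, Equation (3); assumes kappa > 0."""
    return (math.sqrt(1.0 + kappa**2 * x**2) + kappa * x) ** (1.0 / kappa)


def survival(x, alpha, beta, kappa):
    """P_>(x), Equation (15)."""
    return exp_kappa(-beta * x**alpha, kappa)


def density(x, alpha, beta, kappa):
    """p(x), Equation (16)."""
    u = beta * x**alpha
    return (alpha * beta * x ** (alpha - 1.0) * exp_kappa(-u, kappa)
            / math.sqrt(1.0 + kappa**2 * u**2))


def median(alpha, beta, kappa):
    """Equation (21): x_med = beta^(-1/alpha) * [ln_kappa(2)]^(1/alpha)."""
    ln_kappa_2 = (2.0**kappa - 2.0 ** (-kappa)) / (2.0 * kappa)
    return (ln_kappa_2 / beta) ** (1.0 / alpha)


def mode(alpha, beta, kappa):
    """Equation (22); assumes alpha > 1 and 0 < kappa < 1."""
    a = alpha**2 + 2.0 * kappa**2 * (alpha - 1.0)
    b = kappa**2 * (alpha**2 - kappa**2)
    # u2 is the square of u = beta * x^alpha at the mode.
    u2 = a / (2.0 * b) * (math.sqrt(1.0 + 4.0 * b * (alpha - 1.0) ** 2 / a**2) - 1.0)
    return (math.sqrt(u2) / beta) ** (1.0 / alpha)


alpha, beta, kappa = 1.9115, 1.0568, 0.6587  # estimates quoted in Section 3
x_med = median(alpha, beta, kappa)
x_mode = mode(alpha, beta, kappa)
print(x_med, survival(x_med, alpha, beta, kappa))
print(x_mode, density(x_mode, alpha, beta, kappa))
```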



2.3 Moments and other basic properties 



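The moment about zero of order $r - 1$ of $\exp_\kappa\left(-\beta x^\alpha\right)$, with $0 < r < \alpha/\kappa$, can
be obtained in closed form and is given by

$$\int_0^\infty x^{r-1} P_>(x)\,dx = \frac{1}{\alpha}\,\frac{(2\beta\kappa)^{-r/\alpha}}{1 + \frac{r}{\alpha}\kappa}\,\frac{\Gamma\left(\frac{r}{\alpha}\right)\Gamma\left(\frac{1}{2\kappa} - \frac{r}{2\alpha}\right)}{\Gamma\left(\frac{1}{2\kappa} + \frac{r}{2\alpha}\right)}, \tag{23}$$

where $\Gamma(\cdot)$ denotes the gamma function. Therefore, the moment of
order $r$ expressed in terms of the density function of Equation (16), i.e.
$\mu_r = r\int_0^\infty x^{r-1} P_>(x)\,dx = \int_0^\infty x^r p(x)\,dx$, equals

$$\mu_r = \frac{(2\beta\kappa)^{-r/\alpha}}{1 + \frac{r}{\alpha}\kappa}\,\frac{\Gamma\left(1 + \frac{r}{\alpha}\right)\Gamma\left(\frac{1}{2\kappa} - \frac{r}{2\alpha}\right)}{\Gamma\left(\frac{1}{2\kappa} + \frac{r}{2\alpha}\right)}. \tag{24}$$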
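
A minimal numerical sketch of Equation (24) follows. The function name `moment` is an assumption, and the parameter values are the estimates reported in Section 3; since incomes are normalized to their mean, the first moment evaluated at those estimates is expected to be close to one.

```python
import math


def moment(r, alpha, beta, kappa):
    """Raw moment of order r from Equation (24); requires r < alpha / kappa."""
    if r >= alpha / kappa:
        raise ValueError("moment of order r exists only for r < alpha / kappa")
    c = r / alpha
    return ((2.0 * beta * kappa) ** (-c) / (1.0 + c * kappa)
            * math.gamma(1.0 + c)
            * math.gamma(1.0 / (2.0 * kappa) - c / 2.0)
            / math.gamma(1.0 / (2.0 * kappa) + c / 2.0))


alpha, beta, kappa = 1.9115, 1.0568, 0.6587  # estimates quoted in Section 3

m = moment(1, alpha, beta, kappa)                 # mean
variance = moment(2, alpha, beta, kappa) - m**2   # sigma^2 = mu_2 - m^2
cv = math.sqrt(variance) / m                      # coefficient of variation
print(m, variance, cv)
```

With these estimates $\alpha/\kappa \approx 2.9$, so moments of order three and higher do not exist and the function raises an error for them.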



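Specifically, $\mu_1 = m$ is the mean of the distribution and the variance,
$\sigma^2 = \mu_2 - m^2$, is given by

$$\sigma^2 = (2\beta\kappa)^{-2/\alpha}\left\{\frac{\Gamma\left(1 + \frac{2}{\alpha}\right)}{1 + 2\frac{\kappa}{\alpha}}\,\frac{\Gamma\left(\frac{1}{2\kappa} - \frac{1}{\alpha}\right)}{\Gamma\left(\frac{1}{2\kappa} + \frac{1}{\alpha}\right)} - \left[\frac{\Gamma\left(1 + \frac{1}{\alpha}\right)}{1 + \frac{\kappa}{\alpha}}\,\frac{\Gamma\left(\frac{1}{2\kappa} - \frac{1}{2\alpha}\right)}{\Gamma\left(\frac{1}{2\kappa} + \frac{1}{2\alpha}\right)}\right]^2\right\}. \tag{25}$$

Hence, the coefficient of variation, $\mathrm{CV}_\kappa = \sigma/m$, equals

$$\mathrm{CV}_\kappa = \sqrt{\frac{\Gamma\left(1 + \frac{2}{\alpha}\right)}{\Gamma^2\left(1 + \frac{1}{\alpha}\right)}\,\frac{\left(1 + \frac{\kappa}{\alpha}\right)^2}{1 + 2\frac{\kappa}{\alpha}}\,\frac{\Gamma\left(\frac{1}{2\kappa} - \frac{1}{\alpha}\right)}{\Gamma\left(\frac{1}{2\kappa} + \frac{1}{\alpha}\right)}\,\frac{\Gamma^2\left(\frac{1}{2\kappa} + \frac{1}{2\alpha}\right)}{\Gamma^2\left(\frac{1}{2\kappa} - \frac{1}{2\alpha}\right)} - 1}. \tag{26}$$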



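It is also possible to define the standardized measures $\gamma_1 = \bar{\mu}_3/\sigma^3$ and
$\gamma_2 = \bar{\mu}_4/\sigma^4$ of skewness and kurtosis, respectively, given by

$$\gamma_1 = \frac{\mu_3 - 3\mu_2 m + 2m^3}{\sigma^3} \tag{27}$$

and

$$\gamma_2 = \frac{\mu_4 - 4\mu_3 m + 6\mu_2 m^2 - 3m^4}{\sigma^4}, \tag{28}$$

where

$$\bar{\mu}_r = \int_0^\infty (x - m)^r\,p(x)\,dx = \sum_{j=0}^{r}\binom{r}{j}(-1)^{r-j}\mu_j\,m^{r-j} \tag{29}$$

is the moment about the mean of order $r$.

2.4 Lorenz curve and inequality measures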



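For a discussion of income inequality, the standard practice adopts the concept
of concentration of incomes as defined by Lorenz [17]. The so-called Lorenz
curve measures the cumulative fraction of population with incomes below $x$
along the horizontal axis, and the fraction of the total income this population
accounts for along the vertical axis. The points plotted for the various values
of $x$ trace out a curve below the 45° line sloping upwards to the right from
the origin.

In statistical terms, for any general distribution supported on the nonnegative
half-line with a finite and positive first moment, the Lorenz curve is available
in terms of the first-moment distribution $L(u) = m^{-1}\int_0^{x(u)} x\,p(x)\,dx$. Thus
we have the Lorenz curve for the $\kappa$-generalized distribution as follows

$$L_\kappa(u) = 1 - \frac{1 + \frac{\kappa}{\alpha}}{2\Gamma\left(\frac{1}{\alpha}\right)}\,\frac{\Gamma\left(\frac{1}{2\kappa} + \frac{1}{2\alpha}\right)}{\Gamma\left(\frac{1}{2\kappa} - \frac{1}{2\alpha}\right)}\left\{2\alpha\,X^{\frac{1}{2\kappa} - \frac{1}{2\alpha}}(1 - X)^{\frac{1}{\alpha}} + B_X\left(\frac{1}{2\kappa} - \frac{1}{2\alpha}, \frac{1}{\alpha}\right) + B_X\left(\frac{1}{2\kappa} - \frac{1}{2\alpha} + 1, \frac{1}{\alpha}\right)\right\}, \tag{30}$$

where $B_X(\cdot, \cdot)$ is the incomplete beta function with $X = (1 - u)^{2\kappa}$.

The related Gini coefficient of inequality [18] can be easily derived using its
representation in terms of order statistics [19], i.e. $G = 1 - m^{-1}\int_0^\infty \left[P_>(x)\right]^2 dx$;
this yields

$$G_\kappa = 1 - \frac{2\alpha + 2\kappa}{2\alpha + \kappa}\,\frac{\Gamma\left(\frac{1}{\kappa} - \frac{1}{2\alpha}\right)\Gamma\left(\frac{1}{2\kappa} + \frac{1}{2\alpha}\right)}{\Gamma\left(\frac{1}{\kappa} + \frac{1}{2\alpha}\right)\Gamma\left(\frac{1}{2\kappa} - \frac{1}{2\alpha}\right)}. \tag{31}$$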
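
Equation (31) depends only on $\alpha$ and $\kappa$ and is straightforward to evaluate. The sketch below is a minimal illustration (the function name `gini_kappa` is an assumption); it works with the logarithm of the gamma function, a choice made here only to avoid overflow when $\kappa$ is small, and is evaluated at the estimates reported in Section 3.

```python
import math


def gini_kappa(alpha, kappa):
    """Gini coefficient of Equation (31); assumes alpha > kappa > 0."""
    a = 1.0 / (2.0 * alpha)
    log_ratio = (math.lgamma(1.0 / kappa - a)
                 + math.lgamma(1.0 / (2.0 * kappa) + a)
                 - math.lgamma(1.0 / kappa + a)
                 - math.lgamma(1.0 / (2.0 * kappa) - a))
    prefactor = (2.0 * alpha + 2.0 * kappa) / (2.0 * alpha + kappa)
    return 1.0 - prefactor * math.exp(log_ratio)


# Estimates quoted in Section 3.
print(gini_kappa(1.9115, 0.6587))

# For very small kappa the value approaches the stretched exponential
# (Weibull) Gini coefficient 1 - 2**(-1/alpha).
print(gini_kappa(2.0, 1e-3), 1.0 - 2.0 ** (-1.0 / 2.0))
```

The first printed value can be compared with the prediction $G_\kappa = 0.3780$ reported in Section 3.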



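Furthermore, other summary inequality measures can be derived which are
well-known and of widespread use in the econometric literature. For instance,
in the context of the $\kappa$-deformed distribution the generalized entropy (GE)
class of inequality measures [20] assumes the form

$$\mathrm{GE}_\kappa(\theta) = \frac{1}{\theta^2 - \theta}\left[\frac{(2\beta\kappa)^{-\theta/\alpha}\,m^{-\theta}}{1 + \frac{\theta}{\alpha}\kappa}\,\frac{\Gamma\left(1 + \frac{\theta}{\alpha}\right)\Gamma\left(\frac{1}{2\kappa} - \frac{\theta}{2\alpha}\right)}{\Gamma\left(\frac{1}{2\kappa} + \frac{\theta}{2\alpha}\right)} - 1\right], \tag{32}$$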



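with $\theta \neq 0, 1$. Equation (32) defines a class because the index $\mathrm{GE}_\kappa(\theta)$ assumes
different forms depending on the value assigned to $\theta$. From an operational
point of view, two limiting cases of Equation (32) are of particular interest
for inequality measurement: the mean logarithmic deviation index,
$\mathrm{MLD}_\kappa = \lim_{\theta \to 0}\mathrm{GE}_\kappa(\theta)$, given by

$$\mathrm{MLD}_\kappa = \frac{1}{\alpha}\left[\gamma + \psi\left(\frac{1}{2\kappa}\right) + \ln(2\beta\kappa) + \alpha\ln(m) + \kappa\right], \tag{33}$$

where $\gamma = -\psi(1)$ is the Euler-Mascheroni constant and $\psi(z) = \Gamma'(z)/\Gamma(z)$ is
the digamma function, and the Theil [21] index, $T_\kappa = \lim_{\theta \to 1}\mathrm{GE}_\kappa(\theta)$, defined
as

$$T_\kappa = \frac{1}{\alpha}\left[\psi\left(1 + \frac{1}{\alpha}\right) - \frac{1}{2}\psi\left(\frac{1}{2\kappa} - \frac{1}{2\alpha}\right) - \frac{1}{2}\psi\left(\frac{1}{2\kappa} + \frac{1}{2\alpha}\right) - \ln(2\beta\kappa) - \alpha\ln(m) - \frac{\alpha\kappa}{\alpha + \kappa}\right]. \tag{34}$$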

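Other GE indexes often used in applied work are the bottom-sensitive index,

$$\mathrm{GE}_\kappa(-1) = -\frac{1}{2} + \frac{\alpha^2\,\Gamma\left(1 + \frac{1}{\alpha}\right)\Gamma\left(1 - \frac{1}{\alpha}\right)}{2\left(\alpha^2 - \kappa^2\right)}, \tag{35}$$

and the top-sensitive index (or half the squared coefficient of variation),

$$\mathrm{GE}_\kappa(2) = \frac{1}{2}\mathrm{CV}_\kappa^2. \tag{36}$$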



Finally, the Atkinson index [22] with inequality aversion parameter $\epsilon$, where
$\theta = 1 - \epsilon$, can be easily computed from $\mathrm{GE}_\kappa(\theta)$ by exploiting the relationship

$$A_\kappa(\epsilon) = 1 - \left[\epsilon(\epsilon - 1)\,\mathrm{GE}_\kappa(1 - \epsilon) + 1\right]^{\frac{1}{1 - \epsilon}}, \tag{37}$$

where $\epsilon \neq 1$. The limiting form as $\epsilon \to 1$ is

$$A_\kappa(1) = 1 - \exp\left(-\mathrm{MLD}_\kappa\right). \tag{38}$$

2.5 Estimation



Parameter estimation for the $\kappa$-generalized distribution can be performed
using the Maximum Likelihood (ML) approach. Assuming that all observations
$\mathbf{x} = \{x_1, \ldots, x_n\}$ are independent, the likelihood function is

$$L(\boldsymbol{\theta}; \mathbf{x}) = \prod_{i=1}^{n} p(x_i) = (\alpha\beta)^n \prod_{i=1}^{n} \frac{x_i^{\alpha - 1}\exp_\kappa\left(-\beta x_i^\alpha\right)}{\sqrt{1 + \beta^2\kappa^2 x_i^{2\alpha}}}, \tag{39}$$

where $\boldsymbol{\theta} = \{\alpha, \beta, \kappa\}$ is the parameter vector. This leads to the problem of
setting to zero the partial derivatives of the log-likelihood function
$l(\boldsymbol{\theta}; \mathbf{x}) = \ln L(\boldsymbol{\theta}; \mathbf{x})$ with respect to $\alpha$, $\beta$ and $\kappa$. However, explicit
expressions for the ML estimators of the three parameters are not available, so
one needs to use numerical optimization methods.

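Taking into account the meaning of the variable $x$, its mean value equals
unity, i.e. $m = \int_0^\infty x\,p(x)\,dx = 1$. This relationship allows the
parameter $\beta$ to be expressed as a function of the parameters $\alpha$ and $\kappa$, obtaining

$$\beta = \frac{1}{2\kappa}\left[\frac{\Gamma\left(\frac{1}{\alpha}\right)\Gamma\left(\frac{1}{2\kappa} - \frac{1}{2\alpha}\right)}{(\alpha + \kappa)\,\Gamma\left(\frac{1}{2\kappa} + \frac{1}{2\alpha}\right)}\right]^{\alpha}. \tag{40}$$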


In this way, the problem of determining the values of the free parameters
$\{\alpha, \beta, \kappa\}$ of the theory from the empirical data reduces to a two-parameter
$\{\alpha, \kappa\}$ fitting problem. Therefore, to find the parameter values that give the
best fit, one can use the Constrained Maximum Likelihood (CML)
estimation method [23], which solves the general maximum log-likelihood
problem of the form $l(\boldsymbol{\theta}; \mathbf{x}) = \sum_{i=1}^{n} \ln p(x_i; \boldsymbol{\theta})^{w_i}$, where $n$ is the number of
observations, $w_i$ the weight assigned to each observation and $p(x_i; \boldsymbol{\theta})$ the
probability of $x_i$ given $\boldsymbol{\theta}$, subject to the non-linear equality constraint given by
Equation (40) and the bounds $\alpha, \beta > 0$ and $\kappa \in [0, 1)$. The CML procedure finds
values for the parameters in $\boldsymbol{\theta}$ such that the negative of $l(\boldsymbol{\theta}; \mathbf{x})$ is minimized
using the sequential quadratic programming method [24] as implemented, e.g.,
in Matlab® 7.

3 Empirical application to U.S. income data



The $\kappa$-generalized distribution was fitted to data on personal income derived
from the 2003 wave of the U.S. Panel Study of Income Dynamics (PSID) as
released in the Cross-National Equivalent File (CNEF), a commercially
available database compiled by researchers at Cornell University [25]. The 2003
PSID-CNEF data comprise a sample of 7,822 households, and all calculations are
based on the household post-government income (i.e. the income recorded
after taxes and government transfers) expressed in nominal local currency units
and normalized to its empirical average, given by 31,812.39 ± 598.74 USD. We
have omitted from the sample incomes with zero or negative value,
and this affected only a tiny fraction of the data. Furthermore, incomes have
been adjusted for differences in household size by dividing by the square root
of the number of household members and weighted by the provided sampling
weights.



The best-fitting parameter values were determined using CML estimation as
discussed in Section 2.5. This resulted in the following estimates:
$\alpha = 1.9115 \pm 0.0003$, $\beta = 1.0568 \pm 0.0002$ and $\kappa = 0.6587 \pm 0.0003$. The very small value
of the errors indicates that the parameters were precisely estimated, and the
comparison between the observed and fitted probabilities in panels (a) and (b)
of Figure 1 suggests that the $\kappa$-generalized distribution offers great potential
for describing the data over their whole range, from the low to medium income
region through to the high income Pareto power-law regime, including the
intermediate region for which a clear deviation exists when two different curves
are used.



Panel (c) of the same figure depicts the data points for the empirical Lorenz
curve, i.e. $L(i/n) = \sum_{j=1}^{i} x_j \big/ \sum_{j=1}^{n} x_j$, $i = 1, 2, \ldots, n$, superimposed by the
theoretical curve $L_\kappa(u)$ given by Equation (30) with estimates replacing $\alpha$
and $\kappa$ as necessary. This formula is shown by the solid line in the plot, and
fits the data exceptionally well. The plot also compares the empirical Lorenz
curve to the theoretical ones associated with the stretched exponential and
Pareto distributions, respectively given by

$$\lim_{\kappa \to 0} L_\kappa(u) = P\left(1 + \frac{1}{\alpha}, -\ln(1 - u)\right), \tag{41a}$$

where $P(\cdot, \cdot)$ is the lower regularized incomplete gamma function, and

$$\lim_{x \to \infty} L_\kappa(u) = 1 - (1 - u)^{1 - \frac{\kappa}{\alpha}}. \tag{41b}$$

As one can easily recognize, these curves account for only a small part of the
whole story.

Fig. 1. The mean-rescaled U.S. personal income distribution in 2003. (a) Empirical
cumulative distribution in the log-log scale. The solid line is our theoretical model
given by Equation (15), which fits the data very well over the whole range from the
low to the high incomes, including the intermediate income region. This function is
compared with the ordinary stretched exponential one (dotted line), fitting the low
income data, and with the pure power-law (dashed line), fitting the high income
data. (b) Probability density histogram with superimposed fits of the $\kappa$-generalized
(solid line) and Weibull (dotted line) densities. (c) Lorenz curve. The hollow circles
represent the empirical data points and the solid line is the theoretical curve given
by Equation (30) using the same parameter values as in panels (a) and (b). The
dash-dot line corresponds to the Lorenz curve of a society in which everybody
receives the same income and thus serves as a benchmark case against which actual
income distribution may be measured. The dotted and dashed lines represent the
theoretical Lorenz curves from the stretched exponential and Pareto distributions
given by Equations (41a) and (41b), respectively. (d) Q-Q plot
of the sample quantiles versus the corresponding quantiles of the fitted $\kappa$-generalized
(hollow circles), stretched exponential (dotted line) and Pareto (dashed line)
distributions. Where not displayed, the quantiles of these last two distributions coincide
with those of the $\kappa$-generalized. The reference (solid) line has been obtained by
locating points on the plot corresponding to around the 25th and 75th percentiles and
connecting these two. In panels (a), (b) and (d) the income axis limits have been
adjusted according to the range of data to shed light on the intermediate region
between the bulk and the upper end of the distribution.

In order to provide indirect checks on the validity of the parameter estimation,
we have also calculated the sample values of the Gini and Theil indexes,
obtained respectively as $G = n^{-2}\sum_{i=1}^{n}(2i - n - 1)\,x_i$ (with incomes sorted in
ascending order) and $T = n^{-1}\sum_{i=1}^{n} x_i\ln(x_i)$,
which return $G = 0.3805 \pm 0.0092$ and $T = 0.2790 \pm 0.0295$. The corresponding
predictions from the analytical expressions of Equation (31) and Equation (34)
are $G_\kappa = 0.3780$ and $T_\kappa = 0.2600$, and lie entirely within the 95%
confidence intervals constructed around the empirical values (see footnote 1).
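
The sample Gini and Theil indexes quoted above, together with a bootstrap confidence interval, can be sketched as follows. This is an illustration only: the function names are hypothetical, the explicit division by the sample mean is added so that the functions also accept data that are not mean-normalized (it has no effect when the mean is one), and the percentile form of the bootstrap interval is an assumption, since the text only states that bootstrap resampling with 1000 replications was used.

```python
import math
import random


def sample_gini(x):
    """G = n^-2 * sum_i (2i - n - 1) x_i over ascending incomes, divided by the mean."""
    xs = sorted(x)
    n = len(xs)
    mean = sum(xs) / n
    return sum((2 * i - n - 1) * xi for i, xi in enumerate(xs, start=1)) / (n * n * mean)


def sample_theil(x):
    """T = n^-1 * sum_i (x_i / mean) * ln(x_i / mean); requires positive incomes."""
    n = len(x)
    mean = sum(x) / n
    return sum((xi / mean) * math.log(xi / mean) for xi in x) / n


def bootstrap_ci(x, stat, reps=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for stat(x) (assumed interval type)."""
    rng = random.Random(seed)
    n = len(x)
    draws = sorted(stat(rng.choices(x, k=n)) for _ in range(reps))
    lo = draws[int((1.0 - level) / 2.0 * reps)]
    hi = draws[int((1.0 + level) / 2.0 * reps) - 1]
    return lo, hi


# Toy incomes (made-up numbers, for illustration only).
toy = [0.3, 0.5, 0.8, 1.0, 1.2, 2.2]
print(sample_gini(toy), bootstrap_ci(toy, sample_gini))
print(sample_theil(toy), bootstrap_ci(toy, sample_theil))
```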

The accuracy of our distributional model was further examined by testing
the hypothesis that the observed data follow a $\kappa$-generalized distribution
through the Kolmogorov-Smirnov (K-S) goodness-of-fit test statistic given
by $D^+ = \max_{1 \le i \le n}\left[i n^{-1} - P_<(x_i)\right]$, $i = 1, 2, \ldots, n$. Since in this case there
is no asymptotic formula for calculating the p-value, we have reduced the
problem to testing that the $x_\kappa$ values have a standard exponential
distribution (i.e., an exponential distribution with parameter equal to 1) by
relating the function $P_>(x)$ given by Equation (15) to the ordinary
exponential function, namely $\exp_\kappa\left(-\beta x^\alpha\right) = \exp\left(-x_\kappa\right)$, through the transformation
$x_\kappa = \kappa^{-1}\log\left(\sqrt{1 + \beta^2\kappa^2 x^{2\alpha}} + \beta\kappa x^\alpha\right)$, where the parameters are estimated
from the data. Thus the significance level in the upper tail is given
approximately by $P_>(T^*) = \exp\left[-2(T^*)^2\right]$, with $T^* = D^+\left(\sqrt{n} + 0.12 + 0.11/\sqrt{n}\right)$
[28]. The results are $D^+ = 0.0085$ and $P_>(T^*) = 0.3263$: the
maximum distance between the empirical data and the theoretical model as
assessed by the K-S statistic is so small that the null hypothesis that the
data come from a $\kappa$-generalized distribution cannot be rejected
at any of the usual significance levels (1%, 5% and 10%). The
linear behavior emerging from the Quantile-Quantile (Q-Q) plot of the sample
quantiles versus the corresponding quantiles of the fitted $\kappa$-generalized
distribution and its two limiting cases displayed in panel (d) of Figure 1 confirms
the quantitative results obtained by hypothesis testing, as well as the fact that
the stretched exponential and Pareto distributions can give only a partial and
incomplete description of the data.
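
The steps of the test described above can be sketched in a few lines. The function names are hypothetical, and the sample size passed in the usage line (7,822) is an assumption made for illustration: the effective number of observations after dropping non-positive incomes is not stated in the text, so the printed value is not expected to reproduce the reported p-value exactly.

```python
import math


def to_standard_exponential(x, alpha, beta, kappa):
    """x_kappa = log(sqrt(1 + (beta*kappa*x^alpha)^2) + beta*kappa*x^alpha) / kappa."""
    return math.asinh(kappa * beta * x**alpha) / kappa


def d_plus(cdf_values):
    """One-sided K-S statistic D+ = max_i [i/n - F(x_i)] over ascending F(x_i)."""
    values = sorted(cdf_values)
    n = len(values)
    return max(i / n - f for i, f in enumerate(values, start=1))


def upper_tail_p_value(d, n):
    """P_>(T*) = exp[-2 (T*)^2] with T* = D+ (sqrt(n) + 0.12 + 0.11 / sqrt(n))."""
    t_star = d * (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n))
    return math.exp(-2.0 * t_star**2)


# Reported D+ with an assumed sample size of 7,822 observations.
print(upper_tail_p_value(0.0085, 7822))

# The transformed variable is standard exponential under the model:
# exp(-x_kappa) reproduces P_>(x) = exp_kappa(-beta * x^alpha).
alpha, beta, kappa = 1.9115, 1.0568, 0.6587
x = 1.5
u = beta * x**alpha
survival = (math.sqrt(1.0 + kappa**2 * u**2) - kappa * u) ** (1.0 / kappa)
print(math.exp(-to_standard_exponential(x, alpha, beta, kappa)), survival)
```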



4 Concluding remarks 



Fitting a parametric model to income data can be a valuable and informative 
tool for distributional analysis. Not only can one summarize the information 
contained in thousands of observations, but also useful information can be 
drawn directly from the estimated parameters. For example one could be in- 
terested in measuring income inequality, comparing different distributions or 
elaborating income redistribution policy: these concepts may be directly de- 
rived from parameters of a fitted distribution. 

1 The confidence intervals for the observed Gini and Theil indexes have been
calculated via the bootstrap resampling method based on 1000 replications [27].

Starting from the Pareto contribution, a wide variety of functional forms have
been considered as possible models for the distribution of personal income by
size, and other approaches can no doubt be suggested and deserve attention.

In this work we have proposed a new fitting function having its roots in the
framework of the $\kappa$-generalized statistical mechanics. The model has a bulk
very close to the stretched exponential one (which is recovered when the
deformation parameter $\kappa$ tends to zero), while for high values of income its upper
tail approaches a Pareto distribution, thus being able to describe the data over
the entire range. The performance of the distribution has been checked against 
real data on personal income for the United States in 2003 and has been found 
to fit remarkably well. The analysis of inequality performed in terms of its pa- 
rameters reveals the merit of the new proposed distribution, and provides the 
basis for a fruitful interaction between the two fields of statistical mechanics 
and economics. 

References 

[1] Pareto V 1895 Giorn. Econ. 10 59 English translation 1997 in Rivista Politica
Econ. 87 693; Pareto V 1896 La courbe de la répartition de la richesse Reprinted
1965 in Œuvres Complètes de Vilfredo Pareto, Tome 3: Écrits sur la Courbe
de la Répartition de la Richesse ed G Busoni (Geneva: Librairie Droz) English
translation 1997 in Rivista Politica Econ. 87 647; Pareto V 1897a Cours
d'Économie Politique (London: Macmillan); Pareto V 1897b Giorn. Econ. 14
15 English translation 1997 in Rivista Politica Econ. 87 645.

[2] Aitchison J and Brown J A C 1954 Metroecon. 6 81; Aitchison J and Brown 
J A C 1957 The Lognormal Distribution with Special Reference to its Use in 
Economics (New York: Cambridge University Press). 

[3] Salem A B Z and Mount T D 1974 Econometrica 42 1115. 

[4] McDonald J B and Ransom M R 1979 Econometrica 47 1513; McDonald J B 
1984 Econometrica 52 647. 

[5] Mandelbrot B 1960 Int. Econ. Rev. 1 79. 

[6] Singh S K and Maddala G S 1976 Econometrica 44 963. 

[7] Dagum C 1977 Econ. Appl. 30 413. 

[8] Kaniadakis G, Lavagno A and Quarati P 1996 Nucl. Phys. B 466 527; 
Kaniadakis G, Lavagno A and Quarati P 1997 Phys. Lett. A 227 227. 

[9] Kaniadakis G and Quarati P 1997 Physica A 237 229; Kaniadakis G and 
Lapenta G 2000 Phys. Rev. E 62 3246. 

[10] Biro T S and Kaniadakis G 2006 Eur. Phys. J. B 50 3. 






[11] Kaniadakis G 2001a Physica A 296 405; Kaniadakis G 2001b Phys. Lett. A 288 
283. 

[12] Kaniadakis G 2002 Phys. Rev. E 66 056125; Kaniadakis G 2005 Phys. Rev. E 
72 036108. 

[13] Kaniadakis G and Scarfone A M 2004 Physica A 340 102; Abe S, Kaniadakis 
G and Scarfone A M 2004 J. Phys. A: Math. Gen. 37 10513. 

[14] Rajaoarison D, Bolduc D and Jayet H 2006 Econ. Lett. 86 13; Rajaoarison D 
2008 Econ. Lett. 100 396. 

[15] Clementi F, Gallegati M and Kaniadakis G 2007 Eur. Phys. J. B 57 187; 
Clementi F, Di Matteo T, Gallegati M and Kaniadakis G 2008 Physica A 387 
3201. 

[16] Kakwani N 1980 Income Inequality and Poverty: Methods of Estimation and 
Policy Applications (New York: Oxford University Press). 

[17] Lorenz M O 1905 Pub. Am. Stat. Assn. 9 209. 

[18] Gini C 1914 Atti del Reale Istituto Veneto di Scienze, Lettere ed Arti 73 1203 
English translation 2005 in Metron 63 3. 

[19] Arnold B C and Laguna L 1977 On Generalized Pareto Distributions with 
Applications to Income Data (Ames: Iowa State University Press). 

[20] Cowell F A 1980a Europ. Econ. Rev. 13 147; Cowell F A 1980b Rev. Econ. 
Stud. 47 521; Cowell F A and Kuga K 1981a J. Econ. Theory 25 131; Cowell 
F A and Kuga K 1981b Europ. Econ. Rev. 15 287; Cowell F A 1995 Measuring 
Inequality (Hemel Hempstead: Prentice Hall/Harvester Wheatsheaf). 

[21] Theil H 1967 Economics and Information Theory (Amsterdam: North-Holland).

[22] Atkinson A B 1970 J. Econ. Theory 2 244. 

[23] Schoenberg R 1997 Computational Econ. 10 251. 

[24] Han S P 1977 J. Optimiz. Theory App. 22 297. 

[25] Burkhauser R V, Butrica B A, Daly M C and Lillard D R 2001 The
Cross-National Equivalent File: A product of cross-national research Soziale Sicherung
in einer dynamischen Gesellschaft. Festschrift für Richard Hauser zum 65.
Geburtstag (Social Insurance in a Dynamic Society. Papers in Honor of the 65th
Birthday of Richard Hauser) eds I Becker et al. (Frankfurt and New York:
Campus) p 354.

[26] Deaton A 1996 The Analysis of Household Surveys: A Microeconometric 
Approach to Development Policy (Baltimore, MD: Johns Hopkins University 
Press). 

[27] Mills J A and Zandvakili S 1997 J. Appl. Econometrics 12 133. 
[28] Stephens M A 1970 J. R. Stat. Soc. B Met. 32 115. 






